feat: enhance parseRemoteAuthority to handle domains containing "--" #917

Open
ntoofu wants to merge 1 commit into coder:main from ntoofu:domain-with-punycode

@ntoofu ntoofu commented Apr 23, 2026

Problem

As explained in CONTRIBUTING.md,

The host name takes the format coder-vscode.<domain>--<username>--<workspace>.

The parser splits on "--" to extract each component, but domain labels may contain "--".
In practice, IDNA/Punycode-encoded domain labels use xn-- as the ACE prefix.
As a result, the naive split("--") misparses hostnames containing such labels and causes errors like the one in the following screenshot.

Screenshot 2026-04-23 02-50-34

For example, a Coder deployment coder.xn--eckwd4c7cu47r2wf.jp would produce an SSH host of coder-vscode.coder.xn--eckwd4c7cu47r2wf.jp--yourname--your-workspace.
Splitting this on "--" yields ["coder-vscode.coder.xn", "eckwd4c7cu47r2wf.jp", "yourname", "your-workspace"]. The parser expects the first element to be the full hostname, so vscode-coder tries to open a workspace named yourname owned by eckwd4c7cu47r2wf.jp, and fails.
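
The misparse is easy to reproduce in isolation. The snippet below is a minimal sketch that only mirrors the naive split described above; it is not the extension's actual parser, and the variable names are chosen for illustration.

```typescript
// Host name from the example above.
const sshHost =
  "coder-vscode.coder.xn--eckwd4c7cu47r2wf.jp--yourname--your-workspace";

// Naive split on the schema separator.
const naiveParts = sshHost.split("--");

// The punycode label is cut in two, so there are four segments, not three:
// "coder-vscode.coder.xn", "eckwd4c7cu47r2wf.jp", "yourname", "your-workspace"
console.log(naiveParts.length);

// A parser that trusts positions reads the owner from index 1 and the
// workspace from index 2, both of which are wrong for this host.
const [, owner, workspace] = naiveParts;
console.log({ owner, workspace });
```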

Solution

Added a reassembly step after the initial split("--").
The algorithm scans from the front of the split result:

  • The first segment is always a part of a hostname.
  • If the second segment contains a dot, it must be a fragment of the hostname, so merge the two segments back together with "--".
  • Because more than one domain label may contain "--", repeat the merge until the second segment no longer contains a dot.

This works because Coder usernames never contain dots,
so any dot-bearing segment after the split must be a hostname fragment.
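
The reassembly step can be sketched as follows. This is an illustration of the rule described above, not the code in this PR; mergeHostSegments is a hypothetical helper name, and it relies on the stated assumption that owner names never contain a dot.

```typescript
// Hypothetical helper: rejoin hostname fragments produced by split("--").
// Assumes the segment after the hostname (the owner) never contains a dot.
function mergeHostSegments(sshHost: string): string[] {
  const [first = "", ...rest] = sshHost.split("--");
  let host = first;
  let i = 0;
  // Leading segments that contain a dot are still part of the hostname.
  while (i < rest.length && rest[i]?.includes(".")) {
    host += `--${rest[i]}`;
    i++;
  }
  return [host, ...rest.slice(i)];
}

// The punycode label is rejoined, leaving host, owner, and workspace.
console.log(
  mergeHostSegments(
    "coder-vscode.coder.xn--eckwd4c7cu47r2wf.jp--yourname--your-workspace",
  ),
);
```

Hosts without "--" in the domain pass through unchanged, because the loop body never runs for them.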

Note: a TLD containing "--" would not be handled correctly, because such a hostname would split into segments the last of which contains no dot. No such TLD exists today.

@EhabY EhabY self-assigned this Apr 27, 2026
Collaborator

EhabY commented Apr 27, 2026

Nice fix for the punycode case. Before this lands, one realistic case still misparses, and a slightly simpler shape closes it.

Where can -- appear in a hostname?

Only two places:

  1. As a schema separator (coder-vscode.<host>--<owner>--<workspace>--<agent>).
  2. Inside a punycode label (xn--<encoded>) somewhere in the domain.

Coder's UsernameValidRegex forbids -- in usernames, workspaces, and agents, and xn-- is the only IDN ACE prefix IANA has allocated. That's the whole space.

Apex punycode still misparses

A deployment served at the apex of an internationalized domain (e.g. https://xn--p1ai/) produces coder-vscode.xn--p1ai--owner--ws. Splitting gives ["coder-vscode.xn", "p1ai", "owner", "ws"]; parts[1] is p1ai with no dot, so the merge loop exits immediately and p1ai becomes the username. Rare but real.
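
A short sketch of why the dot check misses this case, using the host name above. The variable names are illustrative only.

```typescript
const apexHost = "coder-vscode.xn--p1ai--owner--ws";

// Segments: "coder-vscode.xn", "p1ai", "owner", "ws"
const [firstSegment = "", secondSegment = ""] = apexHost.split("--");

// The dot-based rule inspects the second segment; "p1ai" has no dot,
// so nothing is merged and "p1ai" is read as the owner.
const wouldMerge = secondSegment.includes(".");
console.log(wouldMerge); // false

// Looking at the end of the first segment instead reveals the cut label.
const cutsPunycodeLabel = firstSegment.endsWith(".xn");
console.log(cutsPunycodeLabel); // true
```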

Simpler parser

Scan forward, growing the prefix one segment at a time, and accept the first boundary where the prefix is valid and the remaining tail is a valid <owner>--<workspace>[--<agent>] or <owner>--<workspace>[.<agent>]. A valid prefix is either coder-vscode alone or coder-vscode.<...> that does not end in .xn. Because xn-- is the only way -- can legally appear inside a domain, a prefix ending in .xn means the boundary cuts through a punycode label.

```typescript
// AuthorityPrefix, AuthorityParts, and build() are not defined in this
// snippet; they are assumed to exist in the surrounding module.
const SLUG = /^[a-zA-Z0-9]+(?:-[a-zA-Z0-9]+)*$/;

function isValidPrefix(prefix: string): boolean {
  if (prefix === AuthorityPrefix) return true;
  return (
    prefix.startsWith(`${AuthorityPrefix}.`) && !prefix.endsWith(".xn")
  );
}

export function parseRemoteAuthority(authority: string): AuthorityParts | null {
  const sshHost = authority.split("+")[1] ?? "";
  if (
    sshHost !== AuthorityPrefix &&
    !sshHost.startsWith(`${AuthorityPrefix}--`) &&
    !sshHost.startsWith(`${AuthorityPrefix}.`)
  ) {
    return null;
  }

  const parts = sshHost.split("--");

  for (let i = 1; i <= parts.length - 2; i++) {
    const prefix = parts.slice(0, i).join("--");
    if (!isValidPrefix(prefix)) continue;
    const tail = parts.slice(i);

    // 4-part: <prefix>--<owner>--<workspace>--<agent>
    if (tail.length === 3 && tail.every((p) => SLUG.test(p))) {
      return build(sshHost, prefix, tail[0], tail[1], tail[2]);
    }

    // 3-part: <prefix>--<owner>--<workspace>[.<agent>]
    if (tail.length === 2 && SLUG.test(tail[0])) {
      const [workspace, agent = ""] = tail[1].split(/\.(.+)/);
      if (SLUG.test(workspace) && (agent === "" || SLUG.test(agent))) {
        return build(sshHost, prefix, tail[0], workspace, agent);
      }
    }
  }

  throw new Error("Invalid Coder SSH authority");
}
```

Every existing case parses identically

| Input | Result |
| --- | --- |
| coder-vscode--foo--bar | 3-part, prefix coder-vscode |
| coder-vscode--foo--bar--baz | 4-part, agent baz |
| coder-vscode.dev.coder.com--foo--bar | 3-part |
| coder-vscode.dev.coder.com--foo--bar--baz | 4-part wins (prefix valid, no .xn) |
| coder-vscode.dev.coder.com--foo--bar.baz | 3-part with dot-agent |
| coder-vscode.coder.xn--eckwd4c7cu47r2wf.jp--foo--bar (this PR's case) | i=1 prefix invalid (ends .xn); i=2 accepts |
| coder-vscode.xn--p1ai--owner--ws (apex punycode, currently broken) | i=1 prefix invalid (ends .xn); i=2 accepts |
| coder-vscode.xn--abc.xn--def.com--owner--ws (multiple punycode labels) | i=1 and i=2 invalid; i=3 accepts |
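
These rows can be checked with a small standalone harness. The sketch below reimplements the scan so that it runs on its own. AUTHORITY_PREFIX, SplitResult, and splitSshHost are hypothetical names; the prefix value is assumed from the host names above; and it returns null instead of throwing, so it is not a drop-in replacement for parseRemoteAuthority.

```typescript
// Assumption: the prefix constant has this value, as the inputs above suggest.
const AUTHORITY_PREFIX = "coder-vscode";
const SLUG_RE = /^[a-zA-Z0-9]+(?:-[a-zA-Z0-9]+)*$/;

// Hypothetical result shape, for illustration only.
interface SplitResult {
  prefix: string;
  owner: string;
  workspace: string;
  agent: string;
}

function isValidPrefixSketch(prefix: string): boolean {
  if (prefix === AUTHORITY_PREFIX) return true;
  return prefix.startsWith(`${AUTHORITY_PREFIX}.`) && !prefix.endsWith(".xn");
}

// Returns the first accepted split, or null when no boundary is valid.
function splitSshHost(sshHost: string): SplitResult | null {
  const parts = sshHost.split("--");
  for (let i = 1; i <= parts.length - 2; i++) {
    const prefix = parts.slice(0, i).join("--");
    if (!isValidPrefixSketch(prefix)) continue;
    const tail = parts.slice(i);
    const [owner = "", second = "", third = ""] = tail;

    // 4-part: <prefix>--<owner>--<workspace>--<agent>
    if (tail.length === 3 && tail.every((p) => SLUG_RE.test(p))) {
      return { prefix, owner, workspace: second, agent: third };
    }

    // 3-part: <prefix>--<owner>--<workspace>[.<agent>]
    if (tail.length === 2 && SLUG_RE.test(owner)) {
      const dot = second.indexOf(".");
      const workspace = dot === -1 ? second : second.slice(0, dot);
      const agent = dot === -1 ? "" : second.slice(dot + 1);
      // When a dot is present, the agent after it must itself be a slug.
      const agentOk = dot === -1 || SLUG_RE.test(agent);
      if (SLUG_RE.test(workspace) && agentOk) {
        return { prefix, owner, workspace, agent };
      }
    }
  }
  return null;
}

for (const input of [
  "coder-vscode--foo--bar",
  "coder-vscode.dev.coder.com--foo--bar.baz",
  "coder-vscode.xn--p1ai--owner--ws",
  "coder-vscode.xn--abc.xn--def.com--owner--ws",
]) {
  console.log(input, splitSshHost(input));
}
```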

False negatives

The only input this can reject incorrectly is a deployment whose domain ends in a label literally named xn (e.g. something.xn). IANA reserves the xn-- space, url.domainToASCII does not produce bare xn followed by separator content, and no real TLD or registrable label of xn alone exists. So the only failure mode requires a label that essentially cannot occur in a legitimate domain.

Side benefit

The slug regex also rejects empty names and names ending in - that the current parser passes through, so input validation tightens up at the same time.
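
A quick check of which names the slug pattern accepts. The pattern is copied from the snippet above; the sample names are made up for illustration.

```typescript
const SLUG_PATTERN = /^[a-zA-Z0-9]+(?:-[a-zA-Z0-9]+)*$/;

// Sample names: only the first is a valid slug.
const sampleNames = [
  "your-workspace", // accepted
  "", // rejected: empty
  "trailing-", // rejected: ends in "-"
  "-leading", // rejected: starts with "-"
  "a--b", // rejected: contains "--"
  "dev.agent", // rejected: contains a dot
];

for (const sample of sampleNames) {
  console.log(JSON.stringify(sample), SLUG_PATTERN.test(sample));
}
```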

Happy to push this as a follow-up commit on this branch.

